A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.
Authors
Abstract
Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space, and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features.
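The general idea of clustering samples in a shared latent space can be illustrated with a rough sketch. This is not iClusterBayes: it swaps the Bayesian latent variable model for a plain truncated SVD followed by k-means, treats every data type as continuous, and performs no Bayesian variable selection. The function name `joint_latent_clusters` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def joint_latent_clusters(blocks, n_latent=2, n_clusters=3, n_iter=50, seed=0):
    """Cluster samples in a shared low-dimensional space (illustrative stand-in)."""
    # Standardize each feature within each data type, then stack features side by side.
    scaled = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        sd = X.std(axis=0)
        sd[sd == 0] = 1.0  # avoid division by zero for constant features
        scaled.append((X - X.mean(axis=0)) / sd)
    stacked = np.hstack(scaled)

    # Truncated SVD: rows of Z are per-sample latent scores shared across data types.
    U, S, Vt = np.linalg.svd(stacked, full_matrices=False)
    Z = U[:, :n_latent] * S[:n_latent]

    # Plain k-means (Lloyd's algorithm) in the latent space.
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = Z[labels == k].mean(axis=0)

    # Loadings (rows of Vt) show how strongly each feature contributes to each latent axis.
    return Z, labels, Vt[:n_latent]
```

Here `blocks` is assumed to be a list of samples-by-features matrices, one per omics data type, with the same samples in the same row order. The returned loadings can only be inspected informally; the method described in the abstract instead identifies driver features through Bayesian variable selection.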
Related resources
Gender-based Differences in Associations between Attitude and Self-esteem with Smoking Behavior among Adolescents: A Secondary Analysis Applying Bayesian Nonparametric Functional Latent Variable Model
Background: Different patterns of gender-based relationships between attitude toward smoking and self-esteem with smoking behavior have been reported. However, such associations may be much more complex than a simply assumed linear relationship. We aimed to propose a method of providing hand details on the total and gender-based scenarios of the relationships between attitude toward smoking and sel...
Bayesian Analysis of Survival Data with Spatial Correlation
In practice, data on the mortality of living units are often correlated because of the locations of the observations in the study. One of the most important issues in the analysis of survival data with spatial dependence is the estimation of the parameters and the prediction of unknown values at known sites based on the vector of observations. In this paper, to analyze this type of survival data, Cox...
The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data
The goal of this study is to introduce a statistical method regarding the analysis of specific latent data for regression analysis of the discrete data and to build a relation between a probit regression model (related to the discrete response) and normal linear regression model (related to the latent data of continuous response). This method provides precise inferences on binary and multinomia...
Sparse Integrative Clustering of Multiple Omics Data Sets.
High resolution microarrays and second-generation sequencing platforms are powerful tools to investigate genome-wide alterations in DNA copy number, methylation, and gene expression associated with a disease. An integrated genomic profiling approach measuring multiple omics data types simultaneously in the same set of biological samples would render an integrated data resolution that would not ...
Application of Bayesian Latent Variable Model for Early Detection of Gestational Diabetes Mellitus Without A Perfect Reference Standard Test by β‐human Chorionic Gonadotropin
Background and Objectives: Gestational diabetes mellitus (GDM) is a medical problem in pregnancy, and its late diagnosis can cause adverse effects in the mother and fetus. The purpose of this research was to estimate the accuracy parameters of a biomarker for early prediction of gestational diabetes in the absence of a perfect reference standard test. Methods: This study was conducted in 52...
Journal: Biostatistics
Volume 19, Issue 1
Pages: -
Publication date: 2018